# Unsupervised Learning

| Model | License | Description | Tags | Org | Downloads | Likes |
|---|---|---|---|---|---|---|
| Cityscapes Semantic Eomt Large 1024 | MIT | Demonstrates the potential of the Vision Transformer (ViT) for image segmentation by adapting ViT into an efficient segmentation model. | Image Segmentation | tue-mps | 85 | 0 |
| Chronodepth V1 | MIT | ChronoDepth is a temporally consistent video depth estimation method built on video diffusion priors, learning and predicting depth from video. | 3D Vision | jhshao | 28 | 1 |
| Dust3r ViTLarge BaseDecoder 512 Linear | | DUSt3R is a deep learning model that generates 3D geometry from images, handling a range of geometric 3D vision tasks. | 3D Vision, Safetensors | naver | 313 | 0 |
| Dust3r ViTLarge BaseDecoder 224 Linear | | DUSt3R model for geometric 3D vision from images, able to reconstruct 3D scenes from one or more images. | 3D Vision, Safetensors | naver | 1,829 | 0 |
| Dpt Dinov2 Giant Kitti | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (giant) backbone. | 3D Vision, Transformers | facebook | 56 | 0 |
| Dpt Dinov2 Large Kitti | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (large) backbone. | 3D Vision, Transformers | facebook | 26 | 2 |
| Dpt Dinov2 Base Nyu | Apache-2.0 | Depth-estimation model using the DPT framework with a DINOv2 (base) backbone. | 3D Vision, Transformers | facebook | 146 | 0 |
| Umt5 Xxl | Apache-2.0 | UMT5 is a multilingual text-generation model pretrained on the mC4 multilingual corpus, supporting 107 languages, with language coverage balanced via the UniMax sampling strategy. | Large Language Model, Transformers, Multilingual | google | 4,449 | 32 |
| Umt5 Xl | Apache-2.0 | Multilingual text-generation model pretrained on the mC4 multilingual corpus, supporting 107 languages. | Large Language Model, Transformers, Multilingual | google | 1,049 | 17 |
| Umt5 Small | Apache-2.0 | Unified multilingual T5 model pretrained on the mC4 multilingual corpus, covering 107 languages. | Large Language Model, Transformers, Multilingual | google | 17.35k | 23 |
| Tat Model | | Sentence-embedding model based on sentence-transformers that maps sentences and paragraphs into a 768-dimensional vector space, suited to sentence-similarity and semantic-search tasks. | Text Embedding | mathislucka | 22 | 0 |
| Congen TinyBERT L4 | Apache-2.0 | Sentence-embedding model based on ConGen that maps sentences into a 312-dimensional vector space, suited to semantic search. | Text Embedding, Transformers | kornwtp | 13 | 1 |
| Sup SimCSE VietNamese Phobert Base | | SimeCSE_Vietnamese is a Vietnamese sentence-embedding model based on SimCSE, using PhoBERT as the pretrained language model; it works with both unlabeled and labeled data. | Text Embedding, Transformers, Other | VoVanPhuc | 25.51k | 22 |
| Bimeanvae Amzn | BSD-3-Clause | BiMeanVAE is a variational autoencoder (VAE) based model used primarily for text summarization. | Text Generation, Transformers, English | megagonlabs | 85 | 0 |
| Gpt2 Chinese Cluecorpussmall | | Distilled Chinese GPT-2 model pretrained on the CLUECorpusSmall dataset, suited to Chinese text generation. | Large Language Model, Chinese | uer | 41.45k | 207 |
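Several of the entries above (Tat Model, Congen TinyBERT L4, Sup SimCSE VietNamese Phobert Base) are sentence-embedding models that map text into a fixed-dimensional vector space for similarity and semantic search. The retrieval step behind such models reduces to cosine similarity between a query vector and document vectors. A minimal sketch in NumPy, using made-up 4-dimensional placeholder vectors rather than real model outputs:

```python
import numpy as np

def cosine_similarity(query: np.ndarray, docs: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and a matrix of document vectors."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return d @ q  # one score per document row

# Toy 4-dimensional "embeddings" standing in for a real model's
# 768-dimensional outputs (placeholder values, not from any listed model).
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # doc 0: close to the query direction
    [0.0, 0.8, 0.2, 0.0],   # doc 1: orthogonal to the query
    [0.1, 0.0, 0.0, 0.9],   # doc 2: weakly related
])
query = np.array([1.0, 0.0, 0.0, 0.1])

scores = cosine_similarity(query, docs)
best = int(np.argmax(scores))  # index of the most similar document
```

In practice the vectors would come from a model's encode step (e.g. `SentenceTransformer.encode` in the sentence-transformers library); the ranking logic is identical.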
© 2025 AIbase